Large matchings in uniform hypergraphs and the conjectures of Erdős and Samuels

Noga Alon, Peter Frankl, Hao Huang, Vojtěch Rödl, Andrzej Ruciński, Benny Sudakov



Abstract 



In this paper we study degree conditions which guarantee the existence of perfect matchings and perfect fractional matchings in uniform hypergraphs. We reduce this problem to an old conjecture by Erdős on estimating the maximum number of edges in a hypergraph when the (fractional) matching number is given, which we are able to solve in some special cases using probabilistic techniques. Based on these results, we obtain some general theorems on the minimum $d$-degree ensuring the existence of perfect (fractional) matchings. In particular, we asymptotically determine the minimum vertex degree which guarantees a perfect matching in 4-uniform and 5-uniform hypergraphs. We also discuss an application to a problem of finding an optimal data allocation in a distributed storage system.

1 Introduction

A $k$-uniform hypergraph, or a $k$-graph for short, is a pair $H = (V, E)$, where $V := V(H)$ is a finite set of vertices and $E := E(H) \subseteq \binom{V}{k}$ is a family of $k$-element subsets of $V$ called edges. Whenever convenient we will identify $H$ with $E(H)$. A matching in $H$ is a set of disjoint edges of $H$. The number of edges in a matching is called the size of the matching. The size of the largest matching in a $k$-graph $H$ is denoted by $\nu(H)$. A matching is perfect if its size equals $|V|/k$.



A fractional matching in a $k$-graph $H = (V, E)$ is a function $w : E \to [0,1]$ such that for each $v \in V$ we have $\sum_{e \ni v} w(e) \le 1$. Then $\sum_{e \in E} w(e)$ is the size of $w$. The size of the largest fractional matching in a $k$-graph $H$ is denoted by $\nu^*(H)$. If $\nu^*(H) = n/k$, or equivalently, for all $v \in V$ we have $\sum_{e \ni v} w(e) = 1$, then we call $w$ perfect.

The determination of $\nu^*(H)$ is a linear programming problem. Its dual problem is to find a minimum fractional vertex cover $\tau^*(H) = \sum_{v \in V} w(v)$ over all functions $w : V \to [0,1]$ such that for each $e \in E$ we have $\sum_{v \in e} w(v) \ge 1$. Let $\tau(H)$ be the minimum number of vertices in a vertex cover of $H$. Then, for every $k$-graph $H$, by the Duality Theorem,

$$\nu(H) \le \nu^*(H) = \tau^*(H) \le \tau(H). \tag{1}$$

Given a $k$-graph $H$ and a set $S \in \binom{V}{d}$, $0 \le d \le k-1$, we denote by $\deg_H(S)$ the number of edges in $H$ which contain $S$. Let $\delta_d := \delta_d(H)$ be the minimum $d$-degree of $H$, which is the minimum of $\deg_H(S)$ over all $S \in \binom{V}{d}$. Note that $\delta_0(H) = |E(H)|$. In this paper we study the relation between the minimum $d$-degree $\delta_d(H)$ and the matching numbers $\nu(H)$ and $\nu^*(H)$.

Definition 1.1 Let integers $d, k, s$, and $n$ satisfy $0 \le d \le k-1$ and $0 \le s \le n/k$. We denote by $m_d^s(k,n)$ the minimum $m$ so that for an $n$-vertex $k$-graph $H$, $\delta_d(H) \ge m$ implies that $\nu(H) \ge s$. Equivalently,

$$m_d^s(k,n) - 1 = \max\{\delta_d(H) : |V(H)| = n \text{ and } \nu(H) \le s-1\}.$$

Furthermore, for a real number $0 \le s \le n/k$, define $f_d^s(k,n)$ as the minimum $m$ so that $\delta_d(H) \ge m$ implies that $\nu^*(H) \ge s$. Equivalently,

$$f_d^s(k,n) - 1 = \max\{\delta_d(H) : |V(H)| = n \text{ and } \nu^*(H) < s\}.$$

Observe that trivially, for $\lceil s \rceil \le n/k$,

$$f_d^s(k,n) \le m_d^{\lceil s \rceil}(k,n). \tag{2}$$

We are mostly interested in the case $s = n/k$ (i.e. when matchings are perfect), in which we suppress the superscript in the notation $m_d^s(k,n)$ and $f_d^s(k,n)$. Thus, writing $m_d(k,n)$, we implicitly require that $n$ is divisible by $k$.

Problems of this type have a long history going back to Dirac [1] who in 1952 proved that minimum degree $n/2$ implies the existence of a Hamiltonian cycle in graphs. Therefore, for $d \ge 1$, we refer to the extremal parameters $m_d(k,n)$ and $f_d(k,n)$ as Dirac-type thresholds. When $k = 2$, an easy argument shows that $m_1(2,n) = n/2$. For $k \ge 3$, an exact formula for $m_{k-1}(k,n)$ was obtained in [25]. For a fixed $k \ge 3$ and $n \to \infty$ it yields the asymptotics $m_{k-1}(k,n) = \frac{n}{2} + O(1)$. As far as perfect fractional matchings are concerned, it was proved in [2l] that $f_{k-1}(k,n) = \lceil n/k \rceil$ for $k \ge 2$, which is a lot less than $m_{k-1}(k,n)$ when $k \ge 3$. For more results on Dirac-type thresholds for matchings and Hamilton cycles see [23].

In this paper, we focus on the asymptotic behavior of $m_d(k,n)$ and $f_d(k,n)$ for general, but fixed $k$ and $d$, when $n \to \infty$. For a lower bound on $m_d(k,n)$ consider first a $k$-graph $H_0 = H_0(k,n)$ (constructed in [26]) with vertex set split almost evenly, that is, $V(H_0) = A \cup B$, $\big||A| - |B|\big| \le 2$, and with the edge set consisting of all $k$-element subsets of $V(H_0)$ intersecting $A$ in an odd number of vertices. We choose the size of $A$ so that $|A|$ and $n/k$ have different parity. Clearly, there is no perfect matching in $H_0$, and for every $0 \le d \le k-1$ we have $\delta_d(H_0) \sim \frac{1}{2}\binom{n-d}{k-d}$.

Another lower bound on $m_d(k,n)$ is given by the following well known construction. For integers $n$, $k$, and $s$, let $H_1(s)$ be a $k$-graph on $n$ vertices consisting of all $k$-element subsets intersecting a given set of size $s-1$, that is, $H_1(s) = K_n^{(k)} - K_{n-s+1}^{(k)}$. Observe that $\nu(H_1(s)) = s-1$, while

$$\delta_d(H_1(s)) = \binom{n-d}{k-d} - \binom{n-d-s+1}{k-d}.$$

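Assume that $n$ is divisible by $k$. Putting $s = n/k$ and using the $k$-graphs $H_0$ and $H_1(n/k)$, we obtain a lower bound

$$m_d(k,n) \ge \max\{\delta_d(H_0), \delta_d(H_1(n/k))\} + 1 \sim \max\left\{\frac{1}{2},\ 1 - \left(\frac{k-1}{k}\right)^{k-d}\right\}\binom{n-d}{k-d}. \tag{3}$$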

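On the other hand, $H_1(\lceil n/k \rceil)$ alone yields a lower bound also on $f_d(k,n)$. Indeed, for a real $s > 0$ we have

$$\nu^*(H_1(\lceil s \rceil)) = \tau^*(H_1(\lceil s \rceil)) \le \tau(H_1(\lceil s \rceil)) = \lceil s \rceil - 1 < s,$$

and so

$$f_d^s(k,n) \ge \delta_d(H_1(\lceil s \rceil)) + 1. \tag{4}$$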

It is easy to check that for $d \ge k/2$ the maximum in the coefficient in (3) equals $\frac{1}{2}$. Pikhurko [22] proved, complementing the case $d = k-1$, that indeed we have $m_d(k,n) \sim \frac{1}{2}\binom{n-d}{k-d}$ also for $k/2 \le d \le k-2$, $k \ge 4$.

For $d < k/2$ the problem seems to be harder and we discuss below the cases $d \ge 1$ and $d = 0$ separately. The first result for the range $1 \le d < k/2$, $k \ge 3$, was obtained already in 1981 by Daykin and Häggkvist in [3], who proved that $m_1(k,n) \le \left(\frac{k-1}{k} + o(1)\right)\binom{n-1}{k-1}$. This was generalized to $m_d(k,n) \le \left(\frac{k-d}{k} + o(1)\right)\binom{n-d}{k-d}$ for all $1 \le d < k/2$ in [10], and, using the ideas from [10], slightly improved in [20] to $m_d(k,n) \le \left\{\frac{k-d}{k} - \frac{1}{k^{k-d}} + o(1)\right\}\binom{n-d}{k-d}$. For $k = 4$, $d = 1$ the latter coefficient is $\frac{47}{64}$. In [20], the constant was further lowered to $\frac{42}{64}$, but there is still a gap between this upper bound and the lower bound of $\frac{37}{64}$.

It has been conjectured in [15] and again in [10] that the lower bound (3) is achieved at least asymptotically.



Conjecture 1.1 For all $1 \le d \le k-1$,

$$m_d(k,n) \sim \max\left\{\frac{1}{2},\ 1 - \left(\frac{k-1}{k}\right)^{k-d}\right\}\binom{n-d}{k-d}.$$



Hàn, Person, and Schacht in [10] proved Conjecture 1.1 in the case $d = 1$, $k = 3$ by showing that $m_1(3,n)$ is asymptotically equal to $\frac{5}{9}\binom{n-1}{2}$. Kühn, Osthus, and Treglown [16] and, independently, Khan [13], proved the exact result $m_1(3,n) = \delta_1(H_1(n/3)) + 1$. Recently Khan [14] announced that he verified the exact result $m_1(4,n) = \delta_1(H_1(n/4)) + 1$, while the asymptotic version, $m_1(4,n) \sim \frac{37}{64}\binom{n-1}{3}$, follows also from a more general result by Lo and Markström [19].

These exact results, together with (2) and (4), yield that $f_1(3,n) = m_1(3,n)$ and $f_1(4,n) = m_1(4,n)$. Remembering that, on the other hand, $f_{k-1}(k,n)$ is much smaller than $m_{k-1}(k,n)$, one can raise the question about a general relation between $m_d(k,n)$ and its fractional counterpart $f_d(k,n)$. In this paper we answer this question by showing that $m_d(k,n)$ and $f_d(k,n)$ are asymptotically equal whenever $f_d(k,n) \sim c^*\binom{n-d}{k-d}$ for some constant $c^* > \frac{1}{2}$, and otherwise $m_d(k,n) \sim \frac{1}{2}\binom{n-d}{k-d}$.



Theorem 1.1 For every $1 \le d \le k-1$, if there exists $c^* > 0$ such that $f_d(k,n) \sim c^*\binom{n-d}{k-d}$ then

$$m_d(k,n) \sim \max\left\{c^*, \frac{1}{2}\right\}\binom{n-d}{k-d}. \tag{5}$$

This result reduces the task of asymptotically calculating $m_d(k,n)$ to a presumably simpler task of calculating $f_d(k,n)$. It seems that, similarly to the integral case, the lower bound in (4) determines asymptotically the actual value of the parameter $f_d(k,n)$.



Conjecture 1.2 For all $1 \le d \le k-1$,

$$f_d(k,n) \sim \left(1 - \left(\frac{k-1}{k}\right)^{k-d}\right)\binom{n-d}{k-d}.$$



Our next result confirms Conjecture 1.2 asymptotically for all $k$ and $d$ such that $1 \le k-d \le 4$. Note that the above mentioned result from [23] shows that Conjecture 1.2 is true for $d = k-1$ exactly, that is, $f_{k-1}(k,n) = \delta_{k-1}(H_1(\lceil n/k \rceil)) + 1$. We include this case into the statement of Theorem 1.2 for completeness.



Theorem 1.2 For every $k \ge 3$ and $k-4 \le d \le k-1$, we have

$$f_d(k,n) \sim \left(1 - \left(\frac{k-1}{k}\right)^{k-d}\right)\binom{n-d}{k-d}.$$

Theorems 1.2 and 1.1 together imply immediately the validity of Conjecture 1.1 in a couple of new instances (as discussed earlier, the first of them has been recently also proved in [14] and [19]).
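As an illustration, the coefficient $\max\{1/2,\ 1 - ((k-1)/k)^{k-d}\}$ appearing in Conjecture 1.1 can be tabulated with exact rational arithmetic. The sketch below is only a numerical aid: the function name `conjectured_coefficient` and the selection of pairs $(k,d)$ printed in the demo are our own assumptions, not notation from the paper.

```python
from fractions import Fraction


def conjectured_coefficient(k, d):
    """Exact value of max{1/2, 1 - ((k-1)/k)^(k-d)} for 1 <= d <= k-1."""
    if not 1 <= d <= k - 1:
        raise ValueError("need 1 <= d <= k - 1")
    # Asymptotic density coming from the construction H_1(n/k).
    h1_term = 1 - Fraction(k - 1, k) ** (k - d)
    # The parity construction H_0 always contributes 1/2.
    return max(Fraction(1, 2), h1_term)


if __name__ == "__main__":
    # Hypothetical selection: pairs with k - d <= 4 and d < k/2.
    for k in range(3, 8):
        for d in range(1, k):
            if k - d <= 4 and 2 * d < k:
                print(f"k={k}, d={d}: {conjectured_coefficient(k, d)}")
```

For $k = 4$, $d = 1$ this returns $37/64$, the lower-bound coefficient quoted earlier in the introduction.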

Corollary 1.1 We have 



We prove Theorem 1.2 utilizing the following connection between the parameters $f_d^s(k,n)$ and $f_0^s(k-d, n-d)$.

Proposition 1.1 For all $k \ge 3$, $1 \le d \le k-1$, and $n \ge k$,

$$f_d(k,n) \le f_0^{n/k}(k-d, n-d).$$

In view of Proposition 1.1, in order to prove Theorem 1.2 we need to estimate $f_0^s(k-d, n-d)$ with $s = n/k$. This is trivial for $d = k-1$ and so, from now on, we will be assuming that $d \le k-2$. The integral version of this problem has almost as long a history as the Dirac-type problem ($d \ge 1$).

Erdős and Gallai [6] determined $m_0^s(k,n)$ for graphs ($k = 2$). In 1965, Erdős [3] conjectured the following hypergraph generalization of their result.

Conjecture 1.3 For all $k \ge 2$ and $1 \le s \le n/k$:

$$m_0^s(k,n) = \max\left\{\binom{ks-1}{k},\ \binom{n}{k} - \binom{n-s+1}{k}\right\} + 1.$$

The lower bound comes from considering again the extremal $k$-graph $H_1(s)$ along with the $k$-uniform clique $K_{ks-1}^{(k)}$ (complemented by $n - ks + 1$ isolated vertices) which, clearly, has no matching of size $s$. For more on Erdős' conjecture we refer the reader to the survey paper [7] and a recent paper [9], where the conjecture is proved for $k = 3$ and $n \ge 4s$. In its full generality, the conjecture is still wide open.

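We now formulate the fractional version of Erdős' Conjecture. For future reference, we switch from $k$ and $n$ to $l$ and $m$. Again, the lower bound is yielded by $H_1(\lceil s \rceil)$ and the complete $l$-graph on $\lceil ls \rceil - 1$ vertices, $K_{\lceil ls \rceil - 1}^{(l)}$.

Conjecture 1.4 For all integers $l \ge 2$ and an integer $s$ such that $0 \le s \le m/l$, we have

$$f_0^s(l,m) = \max\left\{\binom{ls-1}{l},\ \binom{m}{l} - \binom{m-s+1}{l}\right\} + 1.$$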

Note that Conjecture 1.4 implies that the bound is also asymptotically true for non-integer values of $s$, when $m$ is large. In [18], there is an example showing that the stronger, precise version of the conjecture does not hold for fractional $s$.

As a consequence of the Erdős-Gallai theorem from [6], Conjecture 1.4 is asymptotically true for $l = 2$ as $m$ goes to infinity. In the next section we establish a result which confirms Conjecture 1.4 asymptotically in the two smallest new instances, but limited to the range $0 \le s \le \frac{m}{l+1}$. In this range the case $l = 3$ follows also from the above mentioned result in [9]. It is easy to check that for $s \le \frac{m}{l+1} + O(1)$, the maximum in Conjecture 1.4 is achieved by the second term.



Theorem 1.3 For $l \in \{3, 4\}$, for all $d \ge 1$, and $s = \frac{m+d}{l+d}$,

$$f_0^s(l,m) \sim \left\{1 - \left(1 - \frac{1}{l+d}\right)^{l}\right\}\binom{m}{l},$$

where the asymptotics holds for $m \to \infty$ with $d$ fixed.



Theorem 1.3 together with Proposition 1.1 implies Theorem 1.2 which, in turn, together with Theorem 1.1 yields Corollary 1.1. To prove Conjecture 1.1 in full generality, one would need to prove Theorem 1.3 for all $l$.

The rest of this paper is organized as follows. In the next section, we prove Theorem 1.3 using as a main tool a probabilistic inequality of Samuels. A proof of Proposition 1.1 and consequently of Theorem 1.2 appears in Section 3. Section 4 contains a proof of Theorem 1.1. Finally, in Section 5 we discuss an application of the fractional version of the Erdős problem in distributed storage allocation. The last section contains concluding remarks and open problems.



2 Fractional matchings and probability of small deviations 

In this section we prove Theorem 1.3 using a probabilistic approach from [1] based on a special case of an old probabilistic conjecture of Samuels [27]. In fact, we prove a little bit more; see Corollary 2.1 and Remark 2.1 below.

For $l$ reals $\mu_1, \dots, \mu_l$ satisfying $0 \le \mu_1 \le \mu_2 \le \dots \le \mu_l$ and $\sum_{i=1}^{l} \mu_i < 1$, let

$$P(\mu_1, \mu_2, \dots, \mu_l) = \inf P(X_1 + \dots + X_l < 1),$$

where the infimum is taken over all possible collections of $l$ independent nonnegative random variables $X_1, \dots, X_l$ with expectations $\mu_1, \dots, \mu_l$, respectively. Define

$$Q_t(\mu_1, \dots, \mu_l) = \prod_{i=t+1}^{l}\left(1 - \frac{\mu_i}{1 - \sum_{j=1}^{t}\mu_j}\right)$$

for each $0 \le t < l$.

Note that $Q_t(\mu_1, \dots, \mu_l)$ is exactly $P(X_1 + \dots + X_l < 1)$ when $X_i$ is identically $\mu_i$ for all $i \le t$, while $X_i$ attains the values $0$ and $1 - \sum_{j \le t}\mu_j$ (with its expectation being $\mu_i$) for all $i \ge t+1$.

The following conjecture was raised by Samuels in [27].

Conjecture 2.1 ([27]) For all admissible values of $\mu_1, \dots, \mu_l$,

$$P(\mu_1, \mu_2, \dots, \mu_l) = \min_{t = 0, \dots, l-1} Q_t(\mu_1, \mu_2, \dots, \mu_l).$$

Note that for $l = 1$ this is Markov's inequality. Samuels proved his conjecture for $l \le 4$ in [27], [28].

Lemma 2.1 ([27], [28]) The assertion of Conjecture 2.1 holds for all $l \le 4$.

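We next show that for $\mu_1 = \mu_2 = \dots = \mu_l = x$, where $0 < x \le \frac{1}{l+1}$, the minimum in Conjecture 2.1 is attained by $Q_0(\mu_1, \dots, \mu_l)$.

Proposition 2.1 For every integer $l \ge 2$ and every real number $x$ satisfying $0 < x \le \frac{1}{l+1}$, if $\mu_1 = \mu_2 = \dots = \mu_l = x$ then

$$\min_{t = 0, \dots, l-1} Q_t(\mu_1, \mu_2, \dots, \mu_l) = Q_0(\mu_1, \mu_2, \dots, \mu_l) = (1-x)^l.$$

Proof: By definition

$$Q_t(x, \dots, x) = \left(1 - \frac{x}{1 - tx}\right)^{l-t} = \left(\frac{1 - (t+1)x}{1 - tx}\right)^{l-t}.$$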

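We thus have to prove that for $0 < x \le \frac{1}{l+1}$ and $1 \le t \le l-1$,

$$(1-x)^l \le \left(\frac{1 - (t+1)x}{1 - tx}\right)^{l-t},$$

or equivalently that

$$\left(\frac{1}{1-x}\right)^{l} \ge \left(\frac{1 - tx}{1 - (t+1)x}\right)^{l-t}.$$

By the geometric-arithmetic means inequality applied to a set of $l$ numbers, $t$ of which are equal to $1$ and the remaining $l - t$ equal to the quantity $\frac{1 - tx}{1 - (t+1)x}$, we conclude that

$$\left(\frac{1 - tx}{1 - (t+1)x}\right)^{l-t} \le \left[\frac{1}{l}\left(t + \frac{(1 - tx)(l - t)}{1 - (t+1)x}\right)\right]^{l}.$$

Thus, it suffices to show that

$$\frac{1}{l}\left(t + \frac{(1 - tx)(l - t)}{1 - (t+1)x}\right) \le \frac{1}{1-x}.$$

This is equivalent to

$$(1-x)\big[(1 - tx)(l - t) + t - t(t+1)x\big] \le l\,[1 - (t+1)x],$$

which is equivalent to

$$(1-x)\,[l - t(l+1)x] \le l - l(t+1)x,$$

or

$$l - t(l+1)x - lx + t(l+1)x^2 \le l - l(t+1)x.$$

After cancelling common terms and dividing by $tx$, we see that this is equivalent to $x \le \frac{1}{l+1}$, which holds by assumption, completing the proof. ■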

Note that when $s = xm$ and $x \le \frac{1}{l+1}$, the maximum in Conjecture 1.4 is achieved by the second term. We now prove the following, for the most part conditional, result, which shows how to deduce Conjecture 1.4 in this range from Conjecture 2.1.



Theorem 2.1 For any $l \ge 3$ and $0 < x \le \frac{1}{l+1}$, if Conjecture 2.1 holds for $l$ and $\mu_1 = \mu_2 = \dots = \mu_l = x$ then

$$f_0^{xm}(l,m) \sim \left\{1 - (1-x)^l\right\}\binom{m}{l}.$$



Combining Theorem 2.1 with Lemma 2.1 we obtain the following corollary which implies Theorem 1.3. (For $d = 1$, observe that $f_0^{s}(l,m) \sim f_0^{s}(l,m+1)$.)

Corollary 2.1 For $l = 3$, $x \le 1/4$ and for $l = 4$, $x \le 1/5$, the maximum number of edges in an $l$-uniform hypergraph $H$ on $m$ vertices with fractional matching number less than $xm$ is

$$f_0^{xm}(l,m) \sim \left\{1 - (1-x)^l\right\}\binom{m}{l}.$$



Proof of Theorem 2.1. Let $H$ be an $l$-uniform hypergraph on a vertex set $V$, $|V| = m$, and suppose that $\nu^*(H) < xm$. By duality, we also have $\tau^*(H) < xm$, and hence there exists a weight function $w : V \to [0,1]$ such that $\sum_{v \in V} w(v) < xm$ and, for every edge $e$ of $H$, $\sum_{v \in e} w(v) \ge 1$. By increasing the weights $w(v)$ if needed, we may assume that

$$\sum_{v \in V} w(v) = xm.$$

Let $v_1, \dots, v_l$ be a sequence of random vertices of $H$, chosen independently and uniformly at random from $V$. For each $i = 1, \dots, l$ we define a random variable $X_i = w(v_i)$. Note that $X_1, X_2, \dots, X_l$ are independent, identically distributed random variables, where every $X_i$ attains each of the $m$ values $w(v)$ with probability $1/m$. (The values of $w$ for different vertices can be equal, but this is of no importance for us.)

By definition, the expectation $\mu_i$ of each $X_i$ is

$$\sum_{v \in V} \frac{1}{m}\, w(v) = \frac{xm}{m} = x.$$

Now we can estimate the number of edges of $H$ as follows. Since for each edge $e$ of $H$ we have $\sum_{v \in e} w(v) \ge 1$, the number $N$ of all $l$-element subsets $S$ of $V$ with $\sum_{v \in S} w(v) < 1$ is a lower bound on the number of non-edges of $H$. Let $N_1$ and $N_2$ be the numbers of all $l$-element sequences of vertices of $V$ and all $l$-element sequences of distinct vertices of $V$, respectively, with the sums of weights strictly smaller than 1. Note that $N_1 - N_2$ is at most the number of $l$-element sequences in which at least one vertex appears twice, thus it is bounded by $\binom{l}{2} m^{l-1} = O(m^{l-1})$. As the number of all $l$-element subsets of $V$ is $\binom{m}{l} = (1 + o(1))\, m^l / l!$ and $N = N_2 / l!$, we have

$$\frac{N_1}{m^l} = \frac{N_2 + O(m^{l-1})}{m^l} = (1 + o(1))\,\frac{N}{\binom{m}{l}} + o(1).$$

If Conjecture 2.1 holds for a given $l$ then, by Lemma 2.1 and Proposition 2.1,

$$\frac{N_1}{m^l} = P\left(\sum_{i=1}^{l} w(v_i) < 1\right) = P\left(\sum_{i=1}^{l} X_i < 1\right) \ge (1-x)^l$$

and, consequently,

$$N \ge (1 + o(1))\,(1-x)^l \binom{m}{l}.$$

It follows that the number of edges of $H$ is at most $(1 + o(1))\left\{1 - (1-x)^l\right\}\binom{m}{l}$, as needed.

Remark 2.1 Note that the above proof works as long as the conclusion of Proposition 2.1 holds. One can check using Mathematica that Proposition 2.1 holds for $l = 3$ and all $0 < x < 0.277$, as well as for $l = 4$ and all $0 < x < 0.217$. Therefore, Corollary 2.1 extends to these broader ranges of $x$. For bigger values of $x$, e.g., for $x = 0.3$ when $l = 3$, this is not the case anymore, and the above method does not suffice to determine the asymptotic behavior of $f_0^{xm}(l,m)$. In fact, using Samuels' conjecture in the higher range of $x$, one gets a bound on $f_0^{xm}(l,m)$ which is larger than that in Conjecture 1.4. However, in view of Proposition 1.1, for our main application the case $x \le \frac{1}{l+1}$ is just what we need.
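The numerical claims of this kind are easy to spot-check. The following sketch evaluates $Q_t(x, \dots, x) = \left(1 - \frac{x}{1 - tx}\right)^{l-t}$ with exact rational arithmetic and tests whether $Q_0$ is the smallest value. It is only an illustration under our own naming (`samuels_q`, `q0_is_minimum` are hypothetical helpers); it checks individual values of $x$ and does not replace the proof of Proposition 2.1.

```python
from fractions import Fraction


def samuels_q(t, l, x):
    """Q_t(x, ..., x) for l equal means x, i.e. (1 - x/(1 - t*x))**(l - t).

    Assumes l*x < 1, so that every denominator is positive.
    """
    return (1 - x / (1 - t * x)) ** (l - t)


def q0_is_minimum(l, x):
    """True if Q_0 <= Q_t for every t = 1, ..., l-1 (equal means x)."""
    q0 = samuels_q(0, l, x)
    return all(q0 <= samuels_q(t, l, x) for t in range(1, l))


if __name__ == "__main__":
    print(q0_is_minimum(3, Fraction(1, 4)))   # inside the range x <= 1/(l+1)
    print(q0_is_minimum(3, Fraction(3, 10)))  # the value x = 0.3 from the remark
```

Using `Fraction` avoids rounding questions near the boundary values of $x$.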



3 Thresholds for perfect fractional matchings 

In this section we present a proof of Proposition 1.1 and then quickly deduce Theorem 1.2.

Proof of Proposition 1.1. The outline of the proof goes as follows. We will assume that there is no fractional perfect matching in a $k$-graph $H$ on $n$ vertices and then show that the neighborhood graph $H(L)$ in $H$ of a particular set $L$ of size $d$ satisfies $\nu^*(H(L)) < n/k$. This will imply that $\delta_d(H) \le |H(L)| < f_0^{n/k}(k-d, n-d)$. In contrapositive, we will prove that if $\delta_d(H) \ge f_0^{n/k}(k-d, n-d)$ then $H$ has a fractional perfect matching, from which it follows, by definition, that $f_d(k,n) \le f_0^{n/k}(k-d, n-d)$.

Let an $n$-vertex $k$-graph $H$ satisfy $\nu^*(H) < n/k$, that is, have no fractional perfect matching. As $\tau^*(H) = \nu^*(H)$, there is a function $w : V \to [0,1]$ such that $\sum_{v \in V} w(v) < n/k$ and, for every $e \in H$, we have $\sum_{v \in e} w(v) \ge 1$. We can replace $H$ with the $k$-graph whose edge set consists of every $k$-tuple of vertices on which $w$ totals to at least one.

Formally, for every weight function $w : V \to [0,1]$ define

$$H_w = \left\{ e \in \binom{V}{k} : \sum_{v \in e} w(v) \ge 1 \right\}.$$

For a given weight function $w$, suppose $L$ is a set of $d$ vertices with the smallest weights. Without loss of generality, we may assume that the $d$ lowest values of $w(x)$ are all equal to each other, since otherwise we could replace them by their average. (Obviously, this modification would not change $\sum_{v \in V} w(v)$ or the total weight of $L$.) Note that the minimum $d$-degree $\delta_d(H_w) = \min_{S \in \binom{V}{d}} \deg_{H_w}(S)$ is achieved when $S = L$. Let $H(L)$ be the neighborhood of $L$ in $H_w$, that is, a $(k-d)$-graph on the vertex set $V \setminus L$ and with the edge set

$$\left\{ S \in \binom{V \setminus L}{k-d} : S \cup L \in H_w \right\}.$$

Then $|H(L)| = \delta_d(H_w)$ and it remains to prove that $\tau^*(H(L)) < n/k$.

Let $w_0 = \min_{v \in V} w(v)$ and observe that $w_0 < 1/k$. If $w_0 > 0$, apply to the weight function $w$ the following linear map:

$$w' = \frac{w - w_0}{1 - k w_0}.$$

Then, still $\sum_{v \in V} w'(v) < n/k$ and $H_w = H_{w'}$. Moreover, for every $v \in L$, we have $w'(v) = 0$. It follows that the function $w'$ restricted to the set $V \setminus L$ is a fractional vertex cover of $H(L)$ and so $\nu^*(H(L)) = \tau^*(H(L)) < n/k$, which completes the proof of Proposition 1.1. ■
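The effect of the linear map $w \mapsto (w - w_0)/(1 - k w_0)$ can be checked on a toy instance. The sketch below builds the threshold hypergraph of a weight vector by brute force and applies the shift; the helper names `threshold_edges` and `shift_weights` and the sample weights are our own assumptions, chosen only to illustrate that the edge set is unchanged while the minimum weight drops to zero.

```python
from fractions import Fraction
from itertools import combinations


def threshold_edges(w, k):
    """All k-subsets of vertex indices whose weights sum to at least 1."""
    n = len(w)
    return {e for e in combinations(range(n), k) if sum(w[v] for v in e) >= 1}


def shift_weights(w, k):
    """Apply w' = (w - w0) / (1 - k*w0), where w0 is the minimum weight.

    Requires w0 < 1/k, which holds whenever sum(w) < n/k.
    """
    w0 = min(w)
    if k * w0 >= 1:
        raise ValueError("need min weight < 1/k")
    return [(x - w0) / (1 - k * w0) for x in w]


if __name__ == "__main__":
    k = 3
    # Hypothetical weights on n = 6 vertices with total 23/12 < n/k = 2.
    w = [Fraction(1, 6), Fraction(1, 6), Fraction(1, 4),
         Fraction(1, 2), Fraction(1, 2), Fraction(1, 3)]
    w_shifted = shift_weights(w, k)
    print(sum(w), sum(w_shifted), min(w_shifted))
    print(threshold_edges(w, k) == threshold_edges(w_shifted, k))
```

The total weight stays below $n/k$ because $\sum_v w'(v) = (\sum_v w(v) - n w_0)/(1 - k w_0)$, exactly as in the proof.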

Proof of Theorem 1.2. As explained earlier, $f_0^{n/k}(k-d, n-d) = n/k$ holds trivially for $d = k-1$ and together with Proposition 1.1 implies the theorem in this case. For $d = k-2$, we apply Proposition 1.1 together with the case $l = 2$ of the fractional Erdős Conjecture 1.4 (as mentioned earlier, it follows asymptotically from [6]). For $d = k-3$ and $d = k-4$, we use Proposition 1.1 and Theorem 1.3 proved in the previous section. ■

Remark 3.1 Consider a restricted version of Samuels' problem to minimize $P(X_1 + \dots + X_l < 1)$ under the additional assumption that all random variables are identically distributed. Our proofs indicate that under this regime, for a given $l \ge 5$ and $\mu_1 = \dots = \mu_l = x \le \frac{1}{l+1}$, if

$$P(X_1 + \dots + X_l < 1) \ge (1 + o(1))\,(1-x)^l$$

then Theorem 1.2 would hold for all $k \ge l+1$ and $d = k - l$.

4 Constructing integer matchings from fractional ones 

In this section, we will prove Theorem 1.1. An indispensable tool in our proof is the Strong Absorbing Lemma 4.1 from [10] (see Lemma 10 therein). This lemma provides a sufficient condition on degrees and co-degrees of a hypergraph ensuring the existence of a small and powerful matching which, by "absorbing" vertices, creates a perfect matching from any nearly perfect matching.

Lemma 4.1 For all $\gamma > 0$ and integers $k > d > 0$ there is an $n_0$ such that for all $n > n_0$ the following holds: suppose that $H$ is a $k$-graph on $n$ vertices with $\delta_d(H) \ge (1/2 + 2\gamma)\binom{n-d}{k-d}$, then there exists a matching $M := M_{abs}$ in $H$ such that

(i) $|M| \le \gamma^{k} n / k$, and

(ii) for every set $W \subset V \setminus V(M)$ of size at most $|W| \le \gamma^{2k} n$ and divisible by $k$ there exists a matching in $H$ covering exactly the vertices of $V(M) \cup W$.

Equipped with this lemma we can practically reduce our task to finding an almost perfect matching in a suitable subhypergraph of $H$. Here is an outline of our proof of Theorem 1.1. Assume that there exists a constant $0 < c^* < 1$ such that $f_d(k,n) \sim c^*\binom{n-d}{k-d}$. For any $\alpha > 0$ consider a $k$-graph $H$ on $n$ vertices, where $n$ is sufficiently large, with

$$\delta_d(H) \ge (c + \alpha)\binom{n-d}{k-d},$$

where $c = \max\{\frac{1}{2}, c^*\}$. Our goal is to show that $H$ contains a perfect matching.
Set $\gamma = \alpha/2$ and $\varepsilon = \gamma^{2k}$. The proof consists of three steps.

1. Find an absorbing matching $M_{abs}$ satisfying properties (i) and (ii) of Lemma 4.1. Set $H' = H \setminus V(M_{abs})$ and note that when $n$ is sufficiently large,

$$\delta_d(H') \ge \delta_d(H) - \left(\binom{n-d}{k-d} - \binom{n-d-\gamma^{k} n}{k-d}\right) \ge (c + \gamma)\binom{n-d}{k-d}.$$

2. Find a matching $M_{alm}$ in $H'$ such that $|V(M_{alm})| \ge (1 - \varepsilon)|V(H')|$, and thus, $|V(M_{alm} \cup M_{abs})| \ge (1 - \varepsilon) n$.

3. Extend $M_{alm} \cup M_{abs}$ to a perfect matching of $H$ by using the absorbing property (ii) of $M_{abs}$ with respect to $W = V(H') \setminus V(M_{alm})$.

Now come the details of the proof. The Strong Absorbing Lemma provides an absorbing matching $M_{abs}$, so Steps 1 and 3 are clear. Hence to complete the proof of Theorem 1.1 it remains to explain Step 2. One possible approach to find an almost perfect matching in $H'$ is via the weak hypergraph regularity lemma. Our proof, however, is based on Theorem 1.1 in [8]. Recall that the 2-degree of a pair of vertices in a hypergraph is the number of edges containing this pair. An immediate corollary of that theorem asserts the existence of an almost perfect matching in any nearly regular $k$-graph in which all 2-degrees are much smaller than the vertex degrees. (See the Remark after Theorem 1.1 in [8] or Chapter 4.7 of [2].) Here we formulate this corollary as the following lemma, in which $\Delta_2(H)$ denotes the maximum 2-degree in $H$.

Lemma 4.2 For every integer $k \ge 2$ and a real $\varepsilon > 0$, there exist $\tau = \tau(k, \varepsilon)$ and $d_0 = d_0(k, \varepsilon)$ such that for every $n \ge D \ge d_0$ the following holds.

Every $k$-uniform hypergraph $H$ on a set $V$ of $n$ vertices which satisfies the following conditions:

1. $(1 - \tau) D < \deg_H(v) < (1 + \tau) D$ for all $v \in V$, and

2. $\Delta_2(H) < \tau D$

contains a matching $M_{alm}$ covering all but at most $\varepsilon n$ vertices.

Hence, Step 2 above reduces to finding a spanning subhypergraph $H''$ of $H'$ satisfying the assumptions of Lemma 4.2 with $\varepsilon = \gamma^{2k}$ and other parameters $\tau$, $D$, $d_0$ to be suitably chosen. Indeed, the following claim is all we need to complete the proof of Theorem 1.1. For convenience, we set $n := |V(H')|$. Recall that $c = \max\{\frac{1}{2}, c^*\}$ where $c^*$ comes from the threshold which guarantees the existence of fractional perfect matchings.

Claim 4.1 For sufficiently large $n$, any $k$-graph $H'$ on $n$ vertices satisfying $\delta_d(H') \ge (c + \gamma)\binom{n-d}{k-d}$ contains a spanning subhypergraph $H''$ such that for all $v \in V(H'')$ we have $\deg_{H''}(v) \sim n^{0.2}$, while $\Delta_2(H'') \le n^{0.1}$.

Consequently, for every $k \ge 2$, $\varepsilon > 0$, the subhypergraph $H''$ satisfies the assumptions of Lemma 4.2 with $D = n^{0.2}$ and any $\tau > 0$. We obtain the following result as an immediate corollary, which asserts the validity of Step 2 and completes our proof of Theorem 1.1.

Corollary 4.1 $H'$ contains an almost perfect matching covering at least $(1 - \varepsilon)|V(H')|$ vertices.

In the proof of Claim 4.1, the following well-known concentration results (see, for example, [2], Appendix A, and Theorem 2.8, inequalities (2.9) and (2.11) in [12]) will be used several times. We denote by $Bi(n,p)$ a binomial random variable with parameters $n$ and $p$.

Lemma 4.3 (Chernoff Inequality for small deviation) If $X = \sum_{i=1}^{n} X_i$, each random variable $X_i$ has Bernoulli distribution with expectation $p_i$, and $a \le 3/2$, then

$$P(|X - EX| \ge a\,EX) \le 2e^{-\frac{a^2}{3} EX}. \tag{6}$$

In particular, when $X \sim Bi(n,p)$ and $\lambda < \frac{3}{2} np$, then

$$P(|X - np| \ge \lambda) \le e^{-\Omega(\lambda^2/(np))}. \tag{7}$$

Lemma 4.4 (Chernoff Inequality for large deviation) If $X = \sum_{i=1}^{n} X_i$, each random variable $X_i$ has Bernoulli distribution with expectation $p_i$, and $x \ge 7\,EX$, then

$$P(X \ge x) \le e^{-x}. \tag{8}$$
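For intuition about how much room inequality (6) leaves, one can compare it with the exact binomial tail on small instances. The sketch below does this by direct summation; the function names and the sample parameters are our own assumptions, and the comparison is only a sanity check of the stated bound, not part of the argument.

```python
from math import comb, exp


def binomial_two_sided_tail(n, p, lam):
    """Exact P(|X - n*p| >= lam) for X ~ Bi(n, p), by direct summation."""
    mean = n * p
    return sum(
        comb(n, j) * p**j * (1 - p) ** (n - j)
        for j in range(n + 1)
        if abs(j - mean) >= lam
    )


def chernoff_small_deviation_bound(mean, a):
    """Right-hand side 2*exp(-(a^2/3)*EX) of inequality (6), for a <= 3/2."""
    if a > 1.5:
        raise ValueError("inequality (6) is stated only for a <= 3/2")
    return 2 * exp(-(a * a / 3) * mean)


if __name__ == "__main__":
    n, p, a = 200, 0.3, 0.5  # illustrative parameters only
    mean = n * p
    print(binomial_two_sided_tail(n, p, a * mean))
    print(chernoff_small_deviation_bound(mean, a))
```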

Proof of Claim 4.1. The desired subhypergraph $H''$ is obtained via two rounds of randomization. In the first round, we find edge-disjoint induced subhypergraphs with large minimum degrees, which guarantees the existence of perfect fractional matchings. In the second round, we construct $H''$ from these fractional matchings.

As a preparation toward the first round, $R$ is obtained by choosing every vertex randomly and independently with probability $p = |V'|^{-0.9} = n^{-0.9}$. Then $|R|$ is a binomial random variable with expectation $n^{0.1}$. By inequality (7), $|R| \sim n^{0.1}$ with probability $1 - e^{-\Omega(n^{0.1})}$.

Fix a subset $D \subset V'$ of size $d$ and let $DEG_D$ be the number of edges $f \in H'$ such that $D \subset f$ and $f \setminus D \subset R$, which is the number of edges $e$ in the link graph $H[D]$ with all of its vertices in the random set $R$. Therefore $DEG_D = \sum_{e \in H[D]} X_e$, where $X_e = 1$ if $e$ is in $R$ and $0$ otherwise. We have

$$E(DEG_D) = \deg_{H'}(D) \times (n^{-0.9})^{k-d} \ge (c + \alpha/2)\binom{n-d}{k-d} n^{-0.9(k-d)} \ge (c + \alpha/3)\binom{n^{0.1}-d}{k-d} = \Omega\!\left(n^{0.1(k-d)}\right).$$



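For two distinct intersecting edges $e_i, e_j$ with $|e_i \cap e_j| = l$ for $1 \le l \le k-d-1$, the probability that both of them are in $R$ is

$$P(X_{e_i} = X_{e_j} = 1) = p^{2(k-d)-l}.$$

For fixed $l$, there are at most $\binom{n-d}{k-d}$ choices for $e_i$ in the link graph $H[D]$, $\binom{k-d}{l}$ ways to choose the intersection $L = e_i \cap e_j$ of size $l$, and at most $\binom{n-k}{k-d-l}$ options for $e_j \setminus L$. Therefore,

$$\Delta = \sum_{e_i \cap e_j \ne \emptyset} P(X_{e_i} = X_{e_j} = 1) \le \sum_{l=1}^{k-d-1} p^{2(k-d)-l}\binom{n-d}{k-d}\binom{k-d}{l}\binom{n-k}{k-d-l} = O\!\left(n^{0.1(2(k-d)-1)}\right).$$

By Janson's inequality (see Theorem 8.7.2 in [2]),

$$P\big(DEG_D \le (1 - \alpha/12)\,E(DEG_D)\big) \le e^{-\Omega\left((E\,DEG_D)^2/\Delta\right)} = e^{-\Omega(n^{0.1})}.$$

Therefore by the union bound, with probability $1 - n^{d} e^{-\Omega(n^{0.1})}$, for all subsets $D \subset V'$ of size $d$, we have

$$DEG_D \ge (1 - \alpha/12)\,E(DEG_D) \ge (c + \alpha/4)\binom{|R|-d}{k-d}.$$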

Take $n^{1.1}$ independent copies of $R$ and denote them by $R^i$, $1 \le i \le n^{1.1}$, and the corresponding random variables by $DEG_D^{(i)}$, where $D \subset V'$, $|D| = d$, and $i = 1, \dots, n^{1.1}$. Since $|R^i| \sim n^{0.1}$ with probability $1 - e^{-\Omega(n^{0.1})}$ for each $i$, the union bound ensures that $|R^i| \sim n^{0.1}$ for every $i = 1, \dots, n^{1.1}$ with probability $1 - o(1)$. Now for a subset of vertices $S \subset V'$, define the random variable

$$Y_S = |\{i : S \subseteq R^i\}|.$$

Note that the random variables $Y_S$ have binomial distributions $Bi(n^{1.1}, n^{-0.9|S|})$ with expectations $n^{1.1 - 0.9|S|}$. In particular, for every vertex $v \in V'$, $Y_{\{v\}} \sim Bi(n^{1.1}, n^{-0.9})$ and $E\,Y_{\{v\}} = n^{0.2}$. Hence, by inequality (7), taking $\lambda = n^{0.15}$,

$$P\left(|Y_{\{v\}} - n^{0.2}| \ge n^{0.15}\right) \le e^{-\Omega\left((n^{0.15})^2/n^{0.2}\right)} = e^{-\Omega(n^{0.1})}.$$

Therefore a.a.s. $|Y_{\{v\}} - n^{0.2}| \le n^{0.15}$ for every vertex $v \in V'$.



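Further, let

$$Z_2 = \left|\left\{ \{u,v\} \in \binom{V'}{2} : Y_{\{u,v\}} \ge 3 \right\}\right|.$$

Then

$$E Z_2 < n^2 (n^{1.1})^3 (n^{-0.9})^6 = n^{-0.1}.$$

Therefore by Markov's inequality,

$$P(Z_2 = 0) = 1 - P(Z_2 \ge 1) \ge 1 - E Z_2 > 1 - n^{-0.1}.$$

This implies that a.a.s. every pair of vertices $\{u,v\}$ is contained in at most two sets $R^i$. Finally, for $k \ge 3$, let

$$Z_k = \left|\left\{ S \in \binom{V'}{k} : Y_S \ge 2 \right\}\right|.$$

Then,

$$E Z_k < n^{k} (n^{1.1})^2 (n^{-0.9k})^2 = n^{2.2 - 0.8k} \le n^{-0.2}.$$

Similarly,

$$P(Z_k = 0) > 1 - n^{-0.2}.$$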



The latter implies that a.a.s. the induced subhypergraphs $H[R^i]$, $i = 1, \dots, n^{1.1}$, are pairwise edge-disjoint. Summarizing, we can choose the sets $R^i$, $1 \le i \le n^{1.1}$, in such a way that

(i) for every $v \in V'$, $Y_{\{v\}} \sim n^{0.2}$,

(ii) every pair $\{u,v\} \subset V'$ is contained in at most two sets $R^i$,

(iii) every edge $e \in H$ is contained in at most one set $R^i$,

(iv) for all $i = 1, \dots, n^{1.1}$, we have $|R^i| \sim n^{0.1}$, and

(v) for all $i = 1, \dots, n^{1.1}$ and all $D \subset V'$, $|D| = d$, we have $DEG_D^{(i)} \ge (c + \alpha/4)\binom{|R^i| - d}{k-d}$.
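The first-moment estimates behind properties (i)-(iii) all come down to adding exponents of $n$: there are at most $n^{s}$ sets of size $s$, at most $(n^{1.1})^{j}$ choices of $j$ indices, and each chosen $R^i$ contains a fixed $s$-set with probability $n^{-0.9s}$. The helper below does this bookkeeping exactly; the name `expected_count_exponent` and the two module constants are our own, introduced only for illustration.

```python
from fractions import Fraction

COPIES_EXP = Fraction(11, 10)   # the number of random sets R^i is n^{1.1}
SAMPLE_EXP = Fraction(-9, 10)   # each vertex is kept with probability n^{-0.9}


def expected_count_exponent(set_size, multiplicity):
    """Exponent e such that n^s * (n^{1.1})^j * (n^{-0.9*s})^j = n^e.

    n^e upper-bounds the expected number of s-sets contained in at least
    j of the random sets R^i (a plain union/first-moment bound).
    """
    s, j = set_size, multiplicity
    return s + j * (COPIES_EXP + SAMPLE_EXP * s)


if __name__ == "__main__":
    print(expected_count_exponent(2, 3))  # pairs lying in three sets
    print(expected_count_exponent(3, 2))  # triples lying in two sets
```

A negative exponent means the corresponding bad event has vanishing expected count, which is what Markov's inequality needs.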

Let us fix a sequence $R^i$, $1 \le i \le n^{1.1}$, satisfying (i)-(v) above.

Our assumption that $f_d(k,n) \sim c^*\binom{n-d}{k-d}$ holds for all sufficiently large values of $n$, in particular with $n$ replaced by $|R^i| \sim n^{0.1}$. Thus, we have

$$f_d(k, |R^i|) \sim c^*\binom{|R^i| - d}{k-d}$$

and, by condition (v) above, we conclude that

$$\delta_d(H[R^i]) \ge (c + \alpha/4)\binom{|R^i| - d}{k-d} \ge f_d(k, |R^i|).$$

Consequently, by the definition of $f_d$, there exists a fractional perfect matching $w^i$ in every subhypergraph $H[R^i]$, $i = 1, \dots, n^{1.1}$.

Now comes the second round of randomization. Let $H^* = \bigcup_i H[R^i]$. We select a generalized binomial subhypergraph $H''$ of $H^*$ by independently choosing each edge $e$ with probability $w^{i_e}(e)$, where $i_e$ is the index $i$ such that $e \in H[R^i]$. Recall that property (iii) ensures that every edge is contained in at most one set $R^i$, which guarantees the uniqueness of $i_e$. We are going to verify our claim by showing $\deg_{H''}(v) \sim n^{0.2}$ for any vertex $v$, while $\Delta_2(H'') \le n^{0.1}$.

Let $I_v = \{i : v \in R^i\}$ and recall that $|I_v| = Y_{\{v\}} \sim n^{0.2}$ by (i). For every $v \in V'$ the set $E_v$ of edges $e \in H^*$ containing $v$ can be partitioned into $|I_v|$ parts $E_v^i = \{e \in E_v \cap H[R^i]\}$. Recall that $w^i$ is a perfect fractional matching, and thus $\sum_{e \in E_v^i} w^i(e) = 1$. For every $v \in V'$ the random variable $D_v = \deg_{H''}(v)$ is equal to $\sum_{i \in I_v}\sum_{e \in E_v^i} X_e$, where $X_e$ are independent random variables having Bernoulli distribution with expectation $w^{i_e}(e)$. Therefore $D_v$ is generalized binomial with expectation

$$E D_v = \sum_{i \in I_v}\sum_{e \in E_v^i} w^{i}(e) = |I_v| \sim n^{0.2}.$$

Hence by Chernoff's inequality (6),

$$P(|D_v - E D_v| \ge a\,E D_v) \le 2e^{-\frac{a^2}{3} E D_v}.$$

Set $a = n^{-0.05}$; then $|D_v - n^{0.2}| \le n^{0.15}$ with probability $1 - O(e^{-n^{0.1}/3})$. Taking a union bound over all the $n$ vertices, we conclude that a.a.s. for all $v \in V'$ we have $D_v \sim n^{0.2}$.

Moreover, for all pairs $u, v \in V'$ the random variable $D_{u,v} = \deg_{H''}(u,v)$ is also generalized binomial with expectation

$$E D_{u,v} \le |I_u \cap I_v| \le 2$$

by (ii). Hence, again by Chernoff's inequality (8) for large deviations, when $n$ is sufficiently large,

$$P(D_{u,v} \ge n^{0.1}) \le e^{-n^{0.1}}.$$

Once again taking the union bound ensures that a.a.s. for every pair of vertices $u, v \in V'$, $D_{u,v} \le n^{0.1}$.


5 An application in distributed storage allocation 

The following model of distributed storage has been studied in information theory |17^ [2H [25] . A 
file is split into multiple chunks, replicated redundantly and stored in a distributed storage system 
with $n$ nodes. Suppose the amount of data to be stored in each node $i$ is equal to $x_i$, where the size of the whole file is normalized to 1. In reality, because there is limited storage space or transmission bandwidth, we require that the total amount of data stored does not exceed a given budget $T$, i.e. $x_1 + \dots + x_n \le T$. At the time of retrieval, we attempt to recover the whole file by accessing only the data stored in a subset $R$ of $r$ nodes which is chosen uniformly at random. It is known that there always exists a coding scheme such that we can recover the file whenever the total amount of data accessed is at least 1. Our goal is to find an optimal allocation $(x_1, \dots, x_n)$ in order to maximize the probability of successful recovery. This problem can be reformulated as follows.

Question 5.1 For a sequence of nonnegative numbers $(x_1, \ldots, x_n)$, let

$$\Phi(x_1, \ldots, x_n) = \left|\left\{ S \subseteq [n] : |S| = r \text{ and } \sum_{i \in S} x_i \ge 1 \right\}\right|.$$

Then the probability of successful recovery of the file equals

$$\frac{\Phi(x_1, \ldots, x_n)}{\binom{n}{r}}.$$

Given integers $n > r > 1$ and a real number $T > 0$, determine

$$F^T(r,n) = \max_{x_1 + \cdots + x_n \le T} \Phi(x_1, \ldots, x_n)$$

and find an allocation attaining $F^T(r,n)$.
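
For small instances, the quantity $\Phi$ in Question 5.1 can be evaluated by direct enumeration. The following is a minimal sketch added for illustration, not part of the original argument. The names `phi` and `recovery_probability` and the sample numbers are hypothetical, and exact rational arithmetic is assumed so that sums equal to the threshold 1 are compared correctly.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def phi(x, r):
    """Number of r-subsets S of the nodes with sum_{i in S} x_i >= 1."""
    return sum(1 for chunk in combinations(x, r) if sum(chunk) >= 1)


def recovery_probability(x, r):
    """Phi(x) / C(n, r): chance that a uniformly random r-subset recovers the file."""
    return Fraction(phi(x, r), comb(len(x), r))


# Illustrative instance (hypothetical numbers): n = 5 nodes, r = 2, budget T = 2.
half, one, zero = Fraction(1, 2), Fraction(1), Fraction(0)
clique_alloc = [half] * 4 + [zero]       # weight 1/2 on 2T = 4 nodes
co_clique_alloc = [one] * 2 + [zero] * 3  # weight 1 on T = 2 nodes
uniform_alloc = [Fraction(2, 5)] * 5     # budget spread evenly over all nodes

print(phi(clique_alloc, 2), phi(co_clique_alloc, 2), phi(uniform_alloc, 2))
```

On this instance the three allocations give 6, 7 and 0 successful pairs out of 10, so spreading the budget evenly is the worst of the three choices here.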

In this section, we always assume that $T$ is integer-valued in order to avoid any rounding issues. If
the total budget $T$ is at least $n/r$ then, by setting $x_i = T/n \ge 1/r$ for all $i$, we can recover the
original file from any subset of size $r$. So, $F^T(r,n) = \binom{n}{r}$ for $T \ge n/r$. For $T < n/r$, let $w(i) = x_i$ be
a weight function from $V = [n]$ to $\mathbb{R}$. Then by the definition of the threshold $r$-uniform hypergraph
$H_w^r$ from Section 3, the edges of $H_w^r$ correspond to the $r$-subsets $S$ such that $\sum_{i \in S} x_i \ge 1$. Thus, it
is easy to see that the fractional matching number of $H_w^r$ satisfies

$$\nu^*(H_w^r) \le \sum_{i=1}^{n} w(i) = \sum_{i=1}^{n} x_i \le T,$$

while

$$\Phi(x_1, \ldots, x_n) = |H_w^r|.$$

Therefore, $F^T(r,n)$ is the maximum number of edges in an $r$-uniform hypergraph on $n$ vertices with
fractional matching number at most $T$. As such, $F^T(r,n)$ differs from $f_0^T(r,n)$ only in that the latter
has the strict inequality $\nu^*(H) < T$ in its definition. But, of course, we have $f_0^T(r,n) \le F^T(r,n) \le
f_0^{T+1}(r,n)$, and so $F^T(r,n) \sim f_0^T(r,n)$ as $n \to \infty$.

Hence, Question 5.1 is asymptotically equivalent to the fractional Erdős Conjecture 1.4. As mentioned
in the introduction, it follows from the Erdős-Gallai theorem [S] that

$$F^T(2,n) \sim f_0^T(2,n) \sim m_0^T(2,n) \sim \max\left\{ \binom{2T}{2},\ \binom{n}{2} - \binom{n-T}{2} \right\}.$$

An easy calculation shows that the above maximum equals the first term if $\frac{2}{5}n \le T < \frac{1}{2}n$, and the
corresponding optimal graph is a clique of size $2T$. This means that, asymptotically, an optimal
allocation is $x_1 = \cdots = x_{2T} = 1/2$ and $x_{2T+1} = \cdots = x_n = 0$. On the other hand, if $T < \frac{2}{5}n$, an
optimal allocation is $x_1 = \cdots = x_T = 1$ and $x_{T+1} = \cdots = x_n = 0$.
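
The crossover between the two terms for $r = 2$ can be checked directly. The following worked comparison is added for illustration and uses only the two binomial expressions appearing in the maximum, for an integer $T \ge 1$.

```latex
\begin{align*}
\binom{2T}{2} &= T(2T-1), \\
\binom{n}{2} - \binom{n-T}{2} &= \frac{n(n-1) - (n-T)(n-T-1)}{2} = \frac{T(2n-T-1)}{2}, \\
T(2T-1) \ge \frac{T(2n-T-1)}{2}
  &\iff 4T - 2 \ge 2n - T - 1
  \iff T \ge \frac{2n+1}{5}.
\end{align*}
```

So the clique term dominates once $T$ exceeds roughly $\frac{2}{5}n$, and the second term dominates below that point.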

For general $r \ge 3$, if Conjecture 1.4 is true, then

$$F^T(r,n) \sim \max\left\{ \binom{rT}{r},\ \binom{n}{r} - \binom{n-T}{r} \right\}.$$






The bounds are achieved when $H$ is a clique or the complement of a clique. A corresponding (asymptotically)
optimal storage allocation is $x_1 = \cdots = x_{rT} = 1/r$, $x_{rT+1} = \cdots = x_n = 0$ or
$x_1 = \cdots = x_T = 1$, $x_{T+1} = \cdots = x_n = 0$, respectively. Corollary 2.1 and Remark 2.1 assert that
for $r = 3$ and $T < 0.277n$, as well as for $r = 4$ and $T < 0.217n$, the latter is an optimal
allocation. Moreover, if Samuels' Conjecture 2.1 holds for all the remaining $r \ge 5$, then
$x_1 = \cdots = x_T = 1$, $x_{T+1} = \cdots = x_n = 0$ is always an asymptotically optimal allocation whenever
$T < n/(r+1)$. Erdős [5] proved Conjecture 1.3 for all $T < n/(2r^3)$. Recently, the authors of
[11] extended the range for which this conjecture holds to $T = O(n/r^2)$. Therefore, in this range,
$F^T(r,n)$ is achieved by the complement of a clique and an optimal allocation is also known to be
$x_1 = \cdots = x_T = 1$, $x_{T+1} = \cdots = x_n = 0$.
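
For concrete values of $r$, $n$ and $T$, the two candidate values in the conjectured maximum can be compared numerically. The sketch below is illustrative only. `conjectured_optimum` and `allocation` are hypothetical helper names, the binomials are the asymptotic expressions from the text rather than exact extremal numbers, and the returned value is the true optimum only in the ranges where the conjecture is known to hold.

```python
from fractions import Fraction
from math import comb


def conjectured_optimum(r, n, T):
    """Return (value, kind) for max{C(rT, r), C(n, r) - C(n-T, r)}.

    Assumes an integer budget T with 0 < T < n/r, as in the text.
    """
    clique = comb(r * T, r)                  # all r-sets inside a set of rT nodes
    co_clique = comb(n, r) - comb(n - T, r)  # r-sets meeting a fixed set of T nodes
    if clique > co_clique:
        return clique, "clique"
    return co_clique, "complement of clique"


def allocation(r, n, T, kind):
    """Storage allocation matching the chosen extremal hypergraph."""
    if kind == "clique":
        return [Fraction(1, r)] * (r * T) + [Fraction(0)] * (n - r * T)
    return [Fraction(1)] * T + [Fraction(0)] * (n - T)


print(conjectured_optimum(3, 30, 5))  # small budget
print(conjectured_optimum(3, 30, 9))  # budget close to n/r = 10
```

With these sample inputs the first call selects the complement of a clique (value 1760) and the second selects the clique (value 2925), matching the qualitative picture that concentrating weight 1 on $T$ nodes wins for small budgets.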

6 Concluding Remarks 

• We have studied sufficient conditions on the minimum $d$-degree which guarantee that a uniform
hypergraph has a perfect matching or perfect fractional matching. We proved that if $f_d(k,n) \sim
c^* \binom{n-d}{k-d}$, then $m_d(k,n) \sim \max\{c^*, 1/2\} \binom{n-d}{k-d}$. Therefore, in order to determine the asymptotic behavior
of the minimum $d$-degree ensuring existence of a perfect matching, we can instead study the
presumably easier question for fractional matchings. Using this approach we showed, in particular,
that $m_1(5,n) \sim \left(1 - \frac{4^4}{5^4}\right) \binom{n-1}{4}$.

• An intriguing problem which remains open is the conjecture by Erdős which states that the
maximum number of edges in a $k$-uniform hypergraph $H$ on $n$ vertices with matching number
smaller than $s$ is exactly

$$\max\left\{ \binom{ks-1}{k},\ \binom{n}{k} - \binom{n-s+1}{k} \right\}.$$



The fractional version of Erdős' conjecture is also very interesting. In its asymptotic form it says
that if $H$ is an $l$-uniform $m$-vertex hypergraph with fractional matching number $\nu^*(H) = xm$,
where $0 < x < 1/l$, then

$$|H| \le (1 + o(1)) \max\left\{ (lx)^l,\ 1 - (1-x)^l \right\} \binom{m}{l}.$$



In Section 2 we showed that the fractional Erdős conjecture is related to a probabilistic conjecture
of Samuels. This conjecture, if proved, would provide a solution to the fractional version of Erdős'
problem for the range $x \le \frac{1}{l+1}$. It would also lead to the asymptotics of $m_d(k,n)$ and $f_d(k,n)$ for
arbitrary $k \ge d+1$ and $d \ge 1$.
• As it turns out, matchings and fractional matchings also have some interesting applications in
information theory. In particular, the uniform model of distributed storage allocation considered
in [29] leads to a question which is asymptotically equivalent to the fractional version of Erdős'
problem. In [17], the set of accessed nodes, $R$, is given by taking each node randomly and
independently with probability $p$. It would be interesting to see if our techniques can be applied
to study this binomial model too.